The prosecutor's fallacy is a fallacy of statistical reasoning in legal settings, in which the context in which the accused was brought to court is wrongly assumed to be irrelevant to how much confidence a jury can place in evidence that carries a statistical measure of doubt. If the defendant was selected from a large group because of the evidence under consideration, then this fact should be included in weighing how incriminating that evidence is. Not doing so is a base rate fallacy. This fallacy usually results in assuming that the prior probability that a piece of evidence would implicate a randomly chosen member of the population is equal to the probability that it would implicate the defendant.
One form of the fallacy results from misunderstanding conditional probability and neglecting the prior odds of a defendant being guilty before that evidence was introduced. When a prosecutor has collected some evidence (for instance a DNA match) and has an expert testify that the probability of finding this evidence if the accused were innocent is tiny, the fallacy occurs if it is concluded that the probability of the accused being innocent must be comparably tiny. The probability of innocence would only be the same small value if the prior odds of guilt were exactly 1:1. If the accused is otherwise totally unconnected to the case, and is only in the courtroom because of that DNA evidence, then we should consider a much lower prior probability of guilt, such as the overall rate of offenders in the populace.
The fallacy can also arise from multiple testing, such as when evidence is compared against a large database. The size of the database raises the likelihood of finding a match by chance alone. DNA evidence is therefore soundest when a match is found after a single directed comparison: when the test sample is of poor quality (common for recovered evidence), a match somewhere in a large database is very likely by mere chance.
The terms "prosecutor's fallacy" and "defense attorney's fallacy" were coined by William C. Thompson and Edward Schumann in the 1987 article "Interpretation of Statistical Evidence in Criminal Trials", subtitled "The Prosecutor's Fallacy and the Defense Attorney's Fallacy".[1][2]
Argument from rarity – Consider this case: a lottery winner is accused of cheating, based on the improbability of winning. At the trial, the prosecutor calculates the (very small) probability of winning the lottery without cheating and argues that this is the chance of innocence. The logical flaw is that the prosecutor has failed to account for the low prior probability of winning in the first place.
Berkson's paradox, mistaking conditional probability for unconditional, led to several wrongful convictions of British mothers accused of murdering two of their children in infancy, where the primary evidence against them was the statistical improbability of two children dying accidentally in the same household (under "Meadow's law"). Though multiple accidental (SIDS) deaths are rare, so are multiple murders; with only the facts of the deaths as evidence, it is the ratio of these (prior) improbabilities that gives the correct "posterior probability" of murder.[3]
In another scenario, a crime-scene DNA sample is compared against a database of 20,000 men. A match is found and that man is accused. At his trial, it is testified that the probability that two DNA profiles match by chance is only 1 in 10,000. This does not mean the probability that the suspect is innocent is 1 in 10,000. Since 20,000 men were tested, there were 20,000 opportunities to find a match by chance.
Even if none of the men in the database left the crime-scene DNA, a match by chance to an innocent is more likely than not. The chance of getting at least one match among the records is:
$$1 - \left(1 - \frac{1}{10{,}000}\right)^{20{,}000} \approx 86\%$$
So, this evidence alone is an uncompelling data dredging result. If the culprit was in the database then he and one or more other men would probably be matched; in either case, it would be a fallacy to ignore the number of records searched when weighing the evidence. "Cold hits" like this on DNA databanks are now understood to require careful presentation as trial evidence.
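The database arithmetic above can be checked with a short sketch. It assumes, as the example implicitly does, that each record is an independent comparison with the same 1-in-10,000 chance of a coincidental match; the function names are illustrative only.

```python
def prob_at_least_one_chance_match(random_match_prob, database_size):
    """Chance that at least one innocent record matches, assuming
    independent comparisons with identical match probability."""
    return 1.0 - (1.0 - random_match_prob) ** database_size


def expected_chance_matches(random_match_prob, database_size):
    """Expected number of coincidental matches in the database."""
    return random_match_prob * database_size


# Figures from the scenario: 20,000 records, 1-in-10,000 match probability.
p_any = prob_at_least_one_chance_match(1 / 10_000, 20_000)
n_expected = expected_chance_matches(1 / 10_000, 20_000)

print(f"P(at least one chance match) = {p_any:.1%}")
print(f"Expected chance matches      = {n_expected:.1f}")
```

Under these assumptions the search is expected to return about two coincidental matches, which is why a single "cold hit" says little on its own.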
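Finding a person innocent or guilty can be viewed in mathematical terms as a form of binary classification. If E is the observed evidence, and I stands for "accused is innocent" then consider the conditional probabilities:
P(E|I) is the probability that the evidence would be observed even when the accused is innocent (a "false positive").
P(I|E) is the probability that the accused is innocent, despite the evidence E.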
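With forensic evidence, P(E|I) is tiny. The prosecutor wrongly concludes that P(I|E) is comparably tiny. (The Lucia de Berk prosecution is accused of exactly this error,[4] for example.) In fact, P(E|I) and P(I|E) are quite different; using Bayes' theorem:
$$P(I \mid E) = \frac{P(E \mid I)\, P(I)}{P(E)}$$
Where:
P(I) is the probability of innocence independent of the evidence E (that is, from all other evidence), and
P(E) is the prior probability that the evidence would be observed, regardless of innocence:
$$P(E) = P(E \mid I)\, P(I) + P(E \mid \lnot I)\,[1 - P(I)]$$
Here ¬I stands for "accused is guilty", so P(E|¬I) is the probability that the evidence would be observed if the accused is guilty. Writing Odds(I) = P(I) / [1 − P(I)], the same relation in terms of odds is:
$$\mathrm{Odds}(I \mid E) = \mathrm{Odds}(I) \cdot \frac{P(E \mid I)}{P(E \mid \lnot I)}$$
The prosecutor is claiming a negligible chance of innocence, given the evidence, implying Odds(I|E) -> P(I|E), or that:
$$P(I \mid E) \approx \mathrm{Odds}(I) \cdot \frac{P(E \mid I)}{P(E \mid \lnot I)}$$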
A prosecutor conflating P(I|E) with P(E|I) makes a technical error whenever Odds(I) >> 1. This may be a harmless error if P(I|E) is still negligible, but it is especially misleading otherwise (mistaking low statistical significance for high confidence).
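As a rough numerical illustration of how the prior odds of innocence change the conclusion, the sketch below applies the odds form of Bayes' theorem, Odds(I|E) = Odds(I) · P(E|I) / P(E|¬I), where ¬I means "accused is guilty". The match probability of 1 in 10,000, the three prior odds, and the default assumption that a guilty person always produces the evidence are hypothetical values chosen for illustration, not figures from any case.

```python
def posterior_odds_innocent(prior_odds_innocent, p_e_given_i, p_e_given_not_i=1.0):
    """Odds form of Bayes' theorem: prior odds times the likelihood ratio.

    p_e_given_not_i defaults to 1.0, i.e. the (assumed) case where a
    guilty person always produces the evidence.
    """
    return prior_odds_innocent * p_e_given_i / p_e_given_not_i


def odds_to_probability(odds):
    """Convert odds in favour of an event into its probability."""
    return odds / (1.0 + odds)


p_e_given_i = 1e-4  # hypothetical chance of the evidence if innocent

# Hypothetical prior odds of innocence: even, 100:1, and 10,000:1.
for prior_odds in (1.0, 100.0, 10_000.0):
    post = odds_to_probability(posterior_odds_innocent(prior_odds, p_e_given_i))
    print(f"Odds(I) = {prior_odds:>8.0f}:1  ->  P(I|E) = {post:.4f}"
          f"   (prosecutor's claim: {p_e_given_i:.4f})")
```

With even prior odds the prosecutor's figure is nearly right; with prior odds of innocence of 10,000 to 1 the same evidence leaves the probability of innocence near one half.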
In the courtroom, the prosecutor's fallacy typically happens by mistake,[5] but deliberate use of the prosecutor's fallacy is prosecutorial misconduct and can subject the prosecutor to official reprimand, disbarment or criminal punishment.
In the adversarial system, lawyers are usually free to present statistical evidence as best suits their case; retrials are more commonly the result of the prosecutor's fallacy in expert witness testimony or in the judge's summation.[6]
Suppose there is a one-in-a-million chance of a match given that the accused is innocent. The prosecutor says this means there is only a one-in-a-million chance of innocence. But if everyone in a community of 10 million people is tested, one expects 10 matches even if all are innocent. The defense fallacy would be to reason that "10 matches were expected, so the accused is no more likely to be guilty than any of the other matches, thus the evidence suggests a 90% chance that the accused is innocent" and that "as such, this evidence is irrelevant."
The first part of the reasoning would be correct only in the case where there is no further evidence pointing to the defendant. On the second part, Thompson & Schumann wrote that the evidence should still be highly relevant because it "drastically narrows the group of people who are or could have been suspects, while failing to exclude the defendant" (page 171).[1][7]
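Both halves of the defense argument can be explored with a small sketch. It assumes a uniform prior over a pool of possible suspects, exactly one true source who always matches, and the one-in-a-million match probability from the example; the narrowed pool of 100 people is a hypothetical stand-in for "further evidence pointing to the defendant".

```python
def posterior_guilt(pool_size, random_match_prob):
    """P(guilty | match) under a uniform prior over pool_size people,
    assuming the true source always matches (an assumption of this sketch)."""
    prior_guilt = 1.0 / pool_size
    guilty_term = prior_guilt * 1.0
    innocent_term = (1.0 - prior_guilt) * random_match_prob
    return guilty_term / (guilty_term + innocent_term)


match_prob = 1e-6  # one-in-a-million chance of a match if innocent

# No other evidence: anyone in the community of 10 million could be the source.
whole_community = posterior_guilt(10_000_000, match_prob)

# Hypothetical: other evidence narrows the plausible suspects to 100 people.
narrowed_pool = posterior_guilt(100, match_prob)

print(f"Pool of 10,000,000: P(guilty | match) = {whole_community:.3f}")
print(f"Pool of 100:        P(guilty | match) = {narrowed_pool:.6f}")
```

With the whole community as the pool, the probability of guilt comes out near 9%, close to the defense's rough 90% chance of innocence. Even so, the match has raised it from one in 10 million, and once the pool is narrowed the same match becomes close to decisive, which is the sense in which the evidence remains highly relevant.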
A version of this fallacy arose in the O. J. Simpson murder trial: crime scene blood matched Simpson's with characteristics shared by 1 in 400 people. The defense argued that a football stadium could be filled with Angelenos matching the sample, so the evidence was useless.[8] Since there were fewer plausible suspects than the population of Los Angeles, that argument was fallacious.[9]
Sally Clark, a British woman, was accused in 1998 of having killed her first child at 11 weeks of age, and then of having killed her second child at 8 weeks of age. The prosecution had expert witness Sir Roy Meadow testify that the probability of two children in the same family dying from SIDS is about 1 in 73 million. That figure was much lower than the actual rate measured in historical data: Meadow estimated it from single-SIDS death data and the assumption that the probability of such deaths should be uncorrelated between infants.[10]
Meadow acknowledged that 1-in-73 million is not an impossibility, but argued that such accidents would happen "once every hundred years" and that, in a country of 15 million 2-child families, it is vastly more likely that the double deaths are due to Münchausen syndrome by proxy than to such a rare accident. However, there is good reason to suppose that the likelihood of a death from SIDS in a family is significantly greater if a previous child has already died in these circumstances: a genetic predisposition to SIDS is likely to invalidate the assumed statistical independence,[11] making some families more susceptible to SIDS, and the error is an outcome of the ecological fallacy.[12] The likelihood of two SIDS deaths in the same family cannot be soundly estimated by squaring the likelihood of a single such death in all otherwise similar families.[13]
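The role of the independence assumption can be written out explicitly. In the sketch below, p is the single-death probability implied by the quoted 1-in-73 million figure (derived here by taking a square root, not quoted from the testimony), and k is a hypothetical factor by which a first SIDS death raises the risk for a second child.

```latex
% Exact decomposition of the joint probability:
P(\text{two SIDS deaths}) = P(\text{first}) \cdot P(\text{second} \mid \text{first})

% Independence assumption: P(second | first) = P(first) = p, so
P(\text{two SIDS deaths}) = p^2 = \frac{1}{73{,}000{,}000}
\quad\Longrightarrow\quad
p \approx \frac{1}{8{,}544}

% If instead P(second | first) = k\,p with k > 1 (hypothetical), then
P(\text{two SIDS deaths}) = p \cdot k\,p = \frac{k}{73{,}000{,}000}
```

Any dependence between the deaths therefore scales the joint probability up by the same factor k, so the squared figure is a lower bound only if k is at least 1.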
The 1-in-73 million figure greatly underestimated the chance of two successive accidents, but, even if that assessment were accurate, the court seems to have missed the fact that the number meant nothing on its own. As an a priori probability, it should have been weighed against the a priori probabilities of the alternatives. Given that two deaths had occurred, one of two possible explanations must be true, and both of these are a priori extremely improbable: (1) two successive deaths in the same family, both from SIDS, or (2) double homicide (the prosecution's case).
It is unclear whether an estimate for the second possibility was ever proposed during the trial, or whether the comparison of these two probabilities was understood to be the key estimate to make in the statistical analysis of the case.
Mrs. Clark was convicted in 1999, resulting in a press release by the Royal Statistical Society which pointed out the mistakes.[14]
In 2002, Ray Hill (Mathematics professor at Salford) attempted to accurately compare the chances of these two possible explanations; he concluded that successive accidents are between 4.5 and 9 times more likely than are successive murders, so that the a priori odds of Clark's guilt were between 4.5 to 1 and 9 to 1 against.[15]
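Converting Hill's odds into probabilities makes the contrast with the 1-in-73 million figure concrete. The conversion below is ordinary odds-to-probability arithmetic and uses only the two deaths themselves as evidence, as in Hill's a priori comparison.

```latex
% Odds of r : 1 against guilt correspond to P(guilt) = 1 / (1 + r).
P(\text{guilt}) = \frac{1}{1 + 4.5} \approx 0.18
\qquad\text{and}\qquad
P(\text{guilt}) = \frac{1}{1 + 9} = 0.10
```

On these figures the a priori probability of double murder lies between about 10% and 18%.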
A higher court later quashed Sally Clark's conviction, on other grounds, on 29 January 2003. However, Sally Clark, a practising solicitor before the conviction, developed a number of serious psychiatric problems, including severe alcohol dependency, and died in 2007 from alcohol poisoning.[16]